Studies of smoking behavior commonly use the time-line follow-back
(TLFB) method, or periodic retrospective recall, to gather data
on daily cigarette consumption. TLFB is considered adequate for 
identifying periods of abstinence and lapse but not for measurement 
of daily cigarette consumption, thanks to substantial recall and digit 
preference biases. With the development of the hand-held electronic 
diary (ED), it has become possible to collect cigarette consumption 
data using ecological momentary assessment (EMA), or the instantaneous
recording of each cigarette as it is smoked. EMA data, because they
do not rely on retrospective recall, are thought to more
accurately measure cigarette consumption. In this article we present 
an analysis of consumption data collected simultaneously by both 
methods from 236 active smokers in the pre-quit phase of a smoking 
cessation study. We define a statistical model that describes the gen- 
esis of the TLFB records as a two-stage process of mis-remembering 
and rounding, including fixed and random effects at each stage. We 
use Bayesian methods to estimate the model, and we evaluate its 
adequacy by studying histograms of imputed values of the latent 
remembered cigarette count. Our analysis suggests that both mis- 
remembering and heaping contribute substantially to the distortion 
of self-reported cigarette counts. Higher nicotine dependence, white 
ethnicity and male sex are associated with greater remembered smok- 
ing given the EMA count. The model is potentially useful in other 
applications where it is desirable to understand the process by which 
subjects remember and report true observations. 
1. Introduction. A common technique for eliciting consumption in studies
of substance abuse is the time-line follow-back (TLFB) method, in which
one asks subjects to report daily consumption retrospectively over the pre- 
ceding week, month or other designated period. In smoking cessation re- 
search, for example, TLFB is one important method for measuring cigarette 
consumption and defining periods of quit and lapse. 

Although TLFB is a practical approach to quantifying average smoking 
behavior [Brown et al. (1998)], TLFB data can harbor substantial errors 
as measures of daily consumption [Klesges, Debon and Ray (1995)]. TLFB 
questionnaires request exact daily cigarette counts, which smokers are un- 
likely to remember, particularly after several days have passed. Moreover, 
some smokers may understate consumption to avoid the social stigma at- 
tached to excessive smoking or an inability to quit [Boyd et al. (1998)]. 
Thus, smoking cessation studies typically require validation of TLFB re- 
ports of zero consumption by biochemical measurement of exhaled carbon 
monoxide or nicotine metabolites from saliva or blood. 

A second concern is that histograms of TLFB-derived daily cigarette 
counts commonly exhibit spikes at multiples of 20, 10 or even 5 cigarettes. 
This phenomenon, known as "digit preference" or "heaping," is thought to 
reflect a tendency to report consumption in terms of packs (each pack in the 
US contains 20 cigarettes) or half or quarter packs. The heaps presumably 
arise because many smokers do not remember precisely how many cigarettes 
they smoked and therefore report their count rounded off to a nearby con- 
venient number. It has also been hypothesized that some smokers consume 
exactly an integral number of packs per day as a self-rationing strategy [Far- 
rell, Fry and Harris (2003)], but evidence so far suggests that such behavior, 
if it exists, causes only a small fraction of the observed heaping [Wang and 
Heitjan (2008)]. Indeed, Klesges, Debon and Ray (1995) observed that the 
distribution of biochemical residues of smoking is smooth, suggesting that 
heaping is a phenomenon of reporting rather than consumption. 

Recall bias and heaping bias in self-reported longitudinal cigarette counts 
potentially affect estimates of both means and treatment effects. Moreover, 
heaping may lead to underestimation of within-subject variability, thanks 
to smokers who regularly report one pack rather than a precise count that 
varies around some mean in the vicinity of 20. If a large enough fraction of 
subjects in a study are of this kind, estimates of both within-subject and 
between-subject variability can be distorted. 

Although there has been substantial research on statistical modeling of 
heaping and digit preference in a range of disciplines [Heitjan and Rubin 
(1990, 1991), Ridout and Morgan (1991), Pickering (1992), Klerman (1993), 
Torelli and Trivellato (1993), Dellaportas et al. (1996), Roberts and Brewer 
(2001), Wright and Bray (2003) and Wolff and Augustin (2003)], the only 
such application in smoking cessation research is that of Wang and Heitjan
(2008), who described a latent-variable rounding model for heaped univariate
TLFB cigarette count data. They postulated that the reported cigarette
count is a function of the unobserved true count and a latent heaping be- 
havior variable. The latter can take one of four values, representing exact 
reporting, rounding to the nearest 5, rounding to the nearest 10, and round- 
ing to the nearest 20. Except for "exact" reporters (i.e., those who report 
counts not divisible by 5), one obtains at best partial information on the 
true count and the heaping behavior. They analyzed univariate count data 
from a smoking cessation clinical trial, assuming a zero-inflated negative bi- 
nomial distribution for the true underlying counts together with an ordered 
categorical logistic selection model for heaping behavior given true count. 

The analysis of Wang and Heitjan (2008) has three important limitations.
First, they included only data from the last day of eight weeks of treatment,
ignoring the 55 preceding days. Second, they assumed, without empirical
verification, that reported counts not divisible by 5 were accurate. Third,
they assumed that the preference for counts ending in 0 or 5 actually
represented rounding rather than some other form of reporting error. That
is, a declared count of 20 cigarettes was taken to mean that the true count
was somewhere between 10 and 30 cigarettes, and was merely misreported
as 20. In the absence of more accurate data on the true, underlying count,
attempts to model heaping must rely on some such assumptions.

Precise assessment of smoking behavior has taken on increasing impor- 
tance as researchers explore the value of reducing consumption as a way 
to lessen the harms of smoking [Shiffman et al. (2002), Hatsukami et al. 
(2002)] and to improve the chance of ultimately quitting [Shiffman, Fergu- 
son and Strahs (2009), Cheong, Yong and Borland (2007)]. The advent of the 
inexpensive hand-held electronic diary (ED) that allows the instantaneous 
recording of ad libitum smoking has created the possibility of making much 
more accurate measurements. Such evaluation is an instance of ecological 
momentary assessment [EMA; Stone and Shiffman (1994)], in that it generates
records of events logged as they occur in real-life settings. In Shiffman
(2009), researchers asked 236 participants in a smoking cessation study to
use a specially programmed ED to record each cigarette as it was smoked 
over a 16-day pre-quit period; moreover, the ED periodically prompted the 
smokers to record any cigarettes they had missed. At days 3, 8 and 15, 
subjects visited the clinic to complete a TLFB assessment of daily smok- 
ing since the preceding visit (2, 5 or 7 days previously), stating how many 
cigarettes they had smoked each day. The study found that while the TLFB 
data contained the expected heaps at multiples of 10 and 20, the EMA data 
had practically none. Average smoking rates from the two methods were 
moderately correlated (r = 0.77), but the within-subject correlation of daily 
consumption between TLFB and EMA was modest (r = 0.29). Self-report 
TLFB consumption was on average higher than EMA (by 2.5 cigarettes),
but on 32% of days, subjects recorded more cigarettes by EMA than they
later recalled by TLFB.

These data provide us with an opportunity — unprecedented, so far as we 
know — to study the relationship between self-reports of daily cigarette con- 
sumption by TLFB and EMA. To describe this relationship, we develop a 
statistical model with two components: the first is a regression that pre- 
dicts the patient's notional "remembered" cigarette count (a latent factor) 
from the EMA count. The second is a regression that predicts the rounding 
behavior — described as in Wang and Heitjan (2008) with an ordinal logistic 
regression — from the remembered count and fully observed predictors. The 
models include random subject effects that describe the propensities of the 
subjects to mis-remember their actual consumption (in the first component) 
and to report the remembered consumption with a characteristic degree of 
accuracy (in the second). Assuming that EMA represents the true count, the 
first component of the model allows us to examine the recall bias resulting 
from mis-remembering, while the second component describes the heaped 
reporting errors. 

2. Notation and model. Let $Y_{it}$ denote the observed heaped TLFB
consumption for subject $i$ on day $t$, $i = 1, \ldots, n$, $t = 1, \ldots, m_i$,
and let $Y_i = (Y_{i1}, \ldots, Y_{im_i})^T$ denote the vector of TLFB data for
subject $i$. Let $X_{it}$ be the EMA consumption on subject $i$, day $t$, and let
$X_i = (X_{i1}, \ldots, X_{im_i})^T$ be the vector of EMA data for subject $i$. We
furthermore let $Z_i = (Z_i^R, Z_i^H)$ be a vector of baseline predictors for
subject $i$, with $Z_i^R$ representing predictors of recall and $Z_i^H$ predictors
of heaping. These predictor sets may overlap.

2.1. A model for remembered cigarette count. The first part of our model
assumes that for each day and subject there is a notional remembered
cigarette count, denoted $W_{it}$ [$W_i = (W_{i1}, \ldots, W_{im_i})^T$]. We assume
$W_{it}$ is distributed as Poisson conditionally on a random effect $b_i$, the
EMA smoking pattern $X_{it}$ and the covariate vector $Z_i$, with mean

$$E(W_{it} \mid X_{it}, Z_i, b_i) = \exp\{\beta_0 + \ln(X_{it})\beta_1 + Z_i^R\beta_2 + b_i\}. \tag{2.1}$$

The parameters $\beta_1$ and $\beta_2$ represent the effects of EMA consumption and
baseline predictors, respectively, on the latent remembered count. The random
effect $b_i$, which we assume normally distributed with mean 0 and variance
$\sigma_b^2$, represents heterogeneity among subjects. We note that there are no
zero values of $X_{it}$ in the Shiffman data, which are from a pre-quit study in
which subjects were encouraged to smoke as normal. Thus, we can include
$\ln(X_{it})$ as a predictor. In more general contexts where zero EMA counts are
possible, one can adjust the model in simple ways to avoid this problem.
Moreover, when excessive zero counts occur in the TLFB data, one can fit a
zero-inflated count model, as in Wang and Heitjan (2008), for the remembered
count.
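
As a quick numerical reading of (2.1), the following minimal sketch evaluates the conditional mean at $b_i = 0$ and the marginal mean obtained by averaging over $b_i \sim N(0, \sigma_b^2)$, which multiplies the conditional mean by $\exp(\sigma_b^2/2)$. It omits covariates and uses the parameter values quoted for the first simulation scenario in Section 4; the function names are illustrative and not part of the original analysis code.

```python
import math

def conditional_mean(x, beta0, beta1, b=0.0):
    """Mean remembered count given EMA count x and random effect b, as in (2.1) without covariates."""
    return math.exp(beta0 + beta1 * math.log(x) + b)

def marginal_mean(x, beta0, beta1, sigma2_b):
    """Average of the conditional mean over b ~ N(0, sigma2_b), using E[exp(b)] = exp(sigma2_b / 2)."""
    return conditional_mean(x, beta0, beta1) * math.exp(sigma2_b / 2.0)

# Values quoted for the first simulation scenario in Section 4.
beta0, beta1, sigma2_b = 2.358, 0.2628, 0.09
for x in (20, 30):
    print(x,
          round(conditional_mean(x, beta0, beta1), 1),
          round(marginal_mean(x, beta0, beta1, sigma2_b), 1))
```

Because the slope on the log scale is well below 1, the remembered count rises much more slowly than the EMA count, which is the pattern examined in the data analysis.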
2.2. A model for the latent heaping process. Following Wang and Heitjan
(2008), we assume that a latent rounding indicator $G_{it}$
[$G_i = (G_{i1}, \ldots, G_{im_i})^T$] dictates the degree of rounding to be applied
to the notional remembered count $W_{it}$. Specifically, we let $G_{it}$ take one of
four possible values: $G_{it} = 1$ implies reporting the exact count, $G_{it} = 2$
implies rounding to the nearest multiple of 5, $G_{it} = 3$ implies rounding to
the nearest multiple of 10, and $G_{it} = 4$ implies rounding to the nearest
multiple of 20. We assume that the probability distribution of the heaping
indicator depends on $W_{it}$, a subject-level random effect $u_i \sim N(0, \sigma_u^2)$
that is independent of $b_i$, and a baseline predictor vector $Z_i^H$.
Specifically, we propose the following proportional odds model for the
conditional distribution of $G_{it}$:

$$f(G_{it} = g \mid W_{it}, Z_i, u_i) =
\begin{cases}
1 - q(\gamma_1 + \eta_{it} + u_i), & \text{if } g = 1; \\
q(\gamma_1 + \eta_{it} + u_i) - q(\gamma_2 + \eta_{it} + u_i), & \text{if } g = 2; \\
q(\gamma_2 + \eta_{it} + u_i) - q(\gamma_3 + \eta_{it} + u_i), & \text{if } g = 3; \\
q(\gamma_3 + \eta_{it} + u_i), & \text{if } g = 4.
\end{cases} \tag{2.2}$$

Here $\eta_{it} = W_{it}\gamma_0 + Z_i^H\beta_3$, and $q(\cdot)$ is the inverse logit function
$q(x) = \exp(x)/(1 + \exp(x))$. The parameters $\gamma_1 > \gamma_2 > \gamma_3$ refer to the
successive intercepts of the logistic regressions, $\gamma_0$ refers to its slope
with respect to the remembered count, and $\beta_3$ refers to its slopes with
respect to the vector of heaping predictors $Z_i^H$. The random effect $u_i$
describes between-subject differences in heaping propensity not otherwise
accounted for in the model.

2.3. The coarsening function. As in Wang and Heitjan (2008), the model
links the observed $Y_{it}$ to the latent $W_{it}$ and $G_{it}$ via the coarsening
function $h(\cdot, \cdot)$:

$$Y_{it} = h(W_{it}, G_{it}), \qquad i = 1, \ldots, n,\ t = 1, \ldots, m_i.$$

For example, at time $t$, subject $i$ with $W_{it} = 14$ and $G_{it} = 1$ reports
$h(14, 1) = 14$, whereas $h(14, 2) = 15$, $h(14, 3) = 10$, and $h(14, 4) = 20$.
Figure 1 illustrates this heaping mechanism.

A coarsened outcome $y_{it}$ may arise from possibly several $(w_{it}, g_{it})$ pairs.
We denote the set of such pairs as
$WG(y_{it}) = \{(w_{it}, g_{it}) : y_{it} = h(w_{it}, g_{it})\}$. For example, a reported
consumption of $y_{it} = 5$ may represent a precise unrounded value
[$(w_{it}, g_{it}) = (5, 1)$] or rounding across a range of nearby values
[$(w_{it}, g_{it}) \in \{(3, 2), (4, 2), (5, 2), (6, 2), (7, 2)\}$]. For subject $i$, the
probability of the observed $y_{it}$ at time $t$ is the sum of the probabilities of
the $(w_{it}, g_{it})$ pairs that would give rise to it. The density of reported
consumption $y_{it}$ given the random effects can therefore be expressed as

$$f(y_{it} \mid b_i, u_i) = \sum_{(w_{it}, g_{it}) \in WG(y_{it})} f(w_{it} \mid b_i)\, f(g_{it} \mid w_{it}, u_i).$$

Fig. 1. Reported cigarette count Y as a function of the underlying count W and the
rounding behavior G. 
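
The heaping probabilities, the coarsening function and the set WG(y) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' code: the explicit rounding rule (nearest multiple, with halves rounded up) is an assumption chosen to reproduce the examples h(14, 2) = 15, h(14, 3) = 10 and h(14, 4) = 20, the covariate term is passed in as a single number `z_effect`, and all function names are hypothetical.

```python
import math

ROUND_TO = {1: 1, 2: 5, 3: 10, 4: 20}  # heaping type g -> rounding unit

def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))

def heaping_probs(w, gammas, gamma0, u=0.0, z_effect=0.0):
    """Probabilities of g = 1, 2, 3, 4 under the proportional odds model (2.2).

    gammas = (gamma1, gamma2, gamma3) with gamma1 > gamma2 > gamma3;
    z_effect stands in for the covariate contribution to eta.
    """
    eta = gamma0 * w + z_effect
    c1, c2, c3 = (inv_logit(g + eta + u) for g in gammas)
    return [1.0 - c1, c1 - c2, c2 - c3, c3]

def coarsen(w, g):
    """h(w, g): round w to the nearest multiple of ROUND_TO[g] (halves round up, an assumption)."""
    unit = ROUND_TO[g]
    return unit * ((2 * w + unit) // (2 * unit))

def compatible_pairs(y, w_max=200):
    """WG(y): all (w, g) with h(w, g) == y, searching w in 0..w_max."""
    return [(w, g) for g in ROUND_TO for w in range(w_max + 1) if coarsen(w, g) == y]

def poisson_pmf(w, mean):
    return math.exp(w * math.log(mean) - mean - math.lgamma(w + 1))

def report_prob(y, mean_w, gammas, gamma0, u=0.0):
    """f(y | b, u): sum over WG(y) of the Poisson probability of w times the heaping probability of g."""
    return sum(poisson_pmf(w, mean_w) * heaping_probs(w, gammas, gamma0, u)[g - 1]
               for w, g in compatible_pairs(y))

# Simulation values quoted in Section 4 (no covariates, u = 0).
gammas, gamma0 = (-1.485, -5.280, -10.141), 0.1098
print([coarsen(14, g) for g in (1, 2, 3, 4)])
print(compatible_pairs(5))
print([round(p, 3) for p in heaping_probs(22, gammas, gamma0)])
```

Since every pair (w, g) maps to exactly one reported value, the values of `report_prob` summed over all y add up to 1, which is a convenient sanity check on any implementation.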



2.4. Estimation. We estimate the model by a Bayesian approach that 
employs importance sampling [Gelman et al. (2004), Tanner (1993)] to avoid 
iterative simulation of parameters. The steps are as follows: we first com- 
pute the posterior mode and information using a quasi-Newton method with 
finite-difference derivatives [Dennis and Schnabel (1983)]. We then
approximate the posterior with a multivariate t density with mean equal to the
posterior mode and dispersion equal to the inverse of the posterior
information matrix at the mode. Next, we draw a large number (4000) of samples
from this proposal distribution, at each draw computing the importance 
ratio r of the true posterior density to the proposal density. We then use 
sampling-importance resampling (SIR) to improve the approximation of the 
posterior [Gelman et al. (2004)]. We evaluate posterior moments by averag- 
ing functions of the simulated parameter draws with the importance ratios 
r as weights. The choice of a t with a small number of degrees of freedom 
as the importance density is intended to balance the convergence of the MC 
integrals and the efficiency of the simulation. 
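
The estimation steps can be illustrated on a one-dimensional toy problem. The sketch below is not the authors' implementation (which uses SAS and R on the full parameter vector): it assumes a toy posterior for theta = log(rate) whose rate has a Gamma(a, b) distribution, so that the mode and information are available in closed form, and it uses a univariate t proposal with 4 degrees of freedom. All names and the choice of target are hypothetical.

```python
import math
import random

def log_target(theta, a=50.0, b=10.0):
    """Unnormalized log posterior of theta = log(rate) when the rate is Gamma(a, b)."""
    return a * theta - b * math.exp(theta)

def t_logpdf(x, loc, scale, df):
    z = (x - loc) / scale
    return (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
            - 0.5 * math.log(df * math.pi) - math.log(scale)
            - (df + 1) / 2 * math.log1p(z * z / df))

def t_draw(rng, loc, scale, df):
    # Student t: a standard normal divided by the square root of chi-square / df.
    chi2_over_df = rng.gammavariate(df / 2.0, 2.0 / df)
    return loc + scale * rng.gauss(0.0, 1.0) / math.sqrt(chi2_over_df)

def importance_sample(n_draws=4000, df=4, seed=1, a=50.0, b=10.0):
    rng = random.Random(seed)
    mode = math.log(a / b)        # solves d/dtheta log_target = 0
    scale = 1.0 / math.sqrt(a)    # information at the mode equals a
    draws = [t_draw(rng, mode, scale, df) for _ in range(n_draws)]
    log_r = [log_target(x, a, b) - t_logpdf(x, mode, scale, df) for x in draws]
    shift = max(log_r)            # stabilize before exponentiating
    weights = [math.exp(v - shift) for v in log_r]
    return rng, draws, weights

rng, draws, weights = importance_sample()

# Posterior moment as a weighted average of a function of the draws.
post_mean_rate = sum(w * math.exp(x) for x, w in zip(draws, weights)) / sum(weights)

# SIR step: resample draws with probability proportional to the importance ratios.
resampled = rng.choices(draws, weights=weights, k=1000)
print(round(post_mean_rate, 2))
```

In this toy case the exact posterior mean of the rate is a / b = 5, which gives a direct check on the weighted average. Here the resampling is done with replacement for simplicity; resampling without replacement is a common alternative.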

Letting $\theta = (\beta_0, \beta_1, \beta_2, \beta_3, \sigma_b, \gamma_1, \gamma_2, \gamma_3, \gamma_0, \sigma_u)$, the
likelihood contribution from subject $i$ is

$$L(\theta; y_i) = \int\!\!\int \prod_{t=1}^{m_i} \sum_{(w_{it}, g_{it}) \in WG(y_{it})} f(w_{it} \mid b_i)\, f(g_{it} \mid w_{it}, u_i)\, f(b_i)\, f(u_i)\, db_i\, du_i; \tag{2.3}$$

we approximate the integral in (2.3) by Gaussian quadrature. We choose
proper but vague priors for the parameters, which we assume are a priori
independent (except for $\gamma_j$, $j = 1, 2, 3$, as noted below). The parameter
$\beta_1$ in the Poisson mixed model (2.1), representing the slope of the latent
recall on the EMA recorded consumption, is given a normal prior
$\beta_1 \sim N(1, 10^2)$, whereas the priors of the other regression parameters in
both model parts are set to $N(0, 10^2)$ subject to the constraint
$\gamma_1 > \gamma_2 > \gamma_3$. We assign the
random-effect variances inverse-gamma priors with mean and SD both equal 
to 1, a reasonably vague specification [Carlin and Louis (2000)]. We obtain 
the posterior mode and information using SAS PROC NLMIXED, and im- 
plement Bayesian importance sampling in R. 
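
To make the quadrature step concrete, the following sketch integrates a normal random intercept out of a Poisson probability with a five-point Gauss-Hermite rule. It is a one-dimensional illustration only: the integral in (2.3) is over both $b_i$ and $u_i$ and involves a product over days, and a practical implementation would likely use more nodes or an adaptive rule. The function names are hypothetical.

```python
import math

# Five-point Gauss-Hermite rule for integrals of exp(-x^2) * g(x).
GH_NODES = (-2.0201828705, -0.9585724646, 0.0, 0.9585724646, 2.0201828705)
GH_WEIGHTS = (0.01995324206, 0.3936193232, 0.9453087205, 0.3936193232, 0.01995324206)

def normal_expectation(func, sigma):
    """Approximate E[func(b)] for b ~ N(0, sigma^2) via the substitution b = sqrt(2) * sigma * x."""
    total = sum(w * func(math.sqrt(2.0) * sigma * x) for x, w in zip(GH_NODES, GH_WEIGHTS))
    return total / math.sqrt(math.pi)

def poisson_pmf(w, mean):
    return math.exp(w * math.log(mean) - mean - math.lgamma(w + 1))

def marginal_poisson_pmf(w, log_mean, sigma_b):
    """P(W = w) after integrating b out of a Poisson with mean exp(log_mean + b)."""
    return normal_expectation(lambda b: poisson_pmf(w, math.exp(log_mean + b)), sigma_b)

# Check against a case with a known answer: E[exp(b)] = exp(sigma^2 / 2).
print(normal_expectation(math.exp, 0.3), math.exp(0.3 ** 2 / 2))
print(marginal_poisson_pmf(20, math.log(23.2), 0.3))
```

The same change of variables applies to each of the two random effects in turn, so a product rule over two sets of nodes would give a sketch of the double integral.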

3. Model checking. With heaped data, the unavailability of simple graph- 
ical diagnostics such as residual plots complicates model evaluation. We 
therefore resort to examination of repeated draws of latent quantities from 
their posterior distributions, in the spirit of Bayesian posterior predictive 
checks [Rubin (1984), Gelman, Meng and Stern (1996), Gelman et al. (2005)]. 
Specifically, we evaluate the adequacy of model assumptions using imputed 
values of the latent recall W, which we compare to its implied marginal 
distribution under the model. 

Imputations of latent $W_i$ and $G_i$ are ultimately based on the posterior
density $f(\theta \mid y)$ of the model parameter $\theta$ given the observed data $y$.
Heitjan and Rubin (1990), sampling univariate $y$ values, used an
acceptance-rejection procedure to draw quantities analogous to our $W$ and $G$
from a confined bivariate normal distribution. In our model, the correlation
within $W_i$ and $G_i$ vectors poses a challenge to simulation. Note, however,
that given the subject-specific effects $b_i$ and $u_i$, the components of $W_i$
and $G_i$ are independent. Thus, we can readily simulate $(W_i, G_i)$ from the
joint posterior of $(W_i, G_i, b_i, u_i)$. For each simulated $\theta$ and the
observed data $y_i$, the posterior distribution of $(W_i, G_i, b_i, u_i)$ is

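$$f(w_i, g_i, b_i, u_i \mid y_i, \theta) = \frac{f(w_i, g_i, b_i, u_i \mid \theta)\, f(y_i \mid w_i, g_i, b_i, u_i, \theta)}{f(y_i \mid \theta)}.$$

Because the values of $w_{it}$ and $g_{it}$ together determine $y_{it}$, we have that

$$f(y_i \mid w_i, g_i, b_i, u_i, \theta) = \prod_{t=1}^{m_i} I((w_{it}, g_{it}) \in WG(y_{it})),$$

where $I$ is an indicator function. Accordingly,

$$\begin{aligned}
f(w_i, g_i, b_i, u_i \mid y_i, \theta)
&\propto f(w_i, g_i, b_i, u_i \mid \theta) \prod_{t=1}^{m_i} I((w_{it}, g_{it}) \in WG(y_{it})) \\
&= f(w_i, g_i \mid b_i, u_i, \theta)\, f(b_i, u_i \mid \theta) \prod_{t=1}^{m_i} I((w_{it}, g_{it}) \in WG(y_{it})) \\
&= f(w_i \mid b_i, \theta)\, f(g_i \mid w_i, u_i, \theta) \prod_{t=1}^{m_i} I((w_{it}, g_{it}) \in WG(y_{it})) \times f(b_i \mid \sigma_b)\, f(u_i \mid \sigma_u) \\
&= \prod_{t=1}^{m_i} \bigl[ f(w_{it} \mid b_i, \theta)\, f(g_{it} \mid w_{it}, u_i, \theta)\, I((w_{it}, g_{it}) \in WG(y_{it})) \bigr] \times f(b_i \mid \sigma_b)\, f(u_i \mid \sigma_u).
\end{aligned}$$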
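
The factorization above implies that, conditional on the random effects, each day's pair can be imputed on its own: propose a remembered count from the Poisson model (2.1) and a heaping type from (2.2), and keep the proposal only if it reproduces the observed report. The sketch below illustrates this for a single subject-day. It is a hedged illustration, not the authors' R code; the rounding rule inside `coarsen` (halves round up) is an assumption, the helper names are hypothetical, and the parameter values are the simulation settings quoted in Section 4.

```python
import math
import random

ROUND_TO = {1: 1, 2: 5, 3: 10, 4: 20}  # heaping type g -> rounding unit

def coarsen(w, g):
    """h(w, g): nearest multiple of the rounding unit (halves round up, an assumption)."""
    unit = ROUND_TO[g]
    return unit * ((2 * w + unit) // (2 * unit))

def draw_poisson(rng, mean):
    """Inversion sampler for a Poisson variate; adequate for moderate means."""
    target = rng.random()
    w = 0
    p = math.exp(-mean)
    cdf = p
    while target > cdf and p > 0.0:
        w += 1
        p *= mean / w
        cdf += p
    return w

def draw_heaping(rng, w, gammas, gamma0, u_i):
    """Draw g from the proportional odds model, given the remembered count w."""
    eta = gamma0 * w + u_i
    # Cumulative probabilities P(G >= 2), P(G >= 3), P(G = 4).
    cum = [1.0 / (1.0 + math.exp(-(g + eta))) for g in gammas]
    v = rng.random()
    return 1 + sum(v < c for c in cum)

def impute_pair(rng, y, mean_w, gammas, gamma0, u_i, max_tries=100000):
    """Propose (w, g) from the model and return the first pair with h(w, g) == y."""
    for _ in range(max_tries):
        w = draw_poisson(rng, mean_w)
        g = draw_heaping(rng, w, gammas, gamma0, u_i)
        if coarsen(w, g) == y:
            return w, g
    raise RuntimeError("no compatible pair found; y may be very unlikely under the model")

rng = random.Random(0)
gammas, gamma0 = (-1.485, -5.280, -10.141), 0.1098
# Impute the latent pair 500 times for a reported count of 20 and a Poisson mean of 23.2.
pairs = [impute_pair(rng, 20, 23.2, gammas, gamma0, u_i=0.0) for _ in range(500)]
print(sorted(set(w for w, _ in pairs)))
```

Every accepted pair satisfies h(w, g) = 20 by construction, while the imputed counts w spread over the values that could have been rounded to 20.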



Thus, given random effects $b_i$ and $u_i$, the imputation of $(w_i, g_i)$ is
obtained by independent draws of $(w_{it}, g_{it})$, $t = 1, \ldots, m_i$, which can be
implemented as an acceptance-rejection procedure. We therefore impute the
data as follows:

(1) Make independent draws $\theta^{(k)}$, $k = 1, \ldots, K$, from $f(\theta \mid y)$ by SIR.

(2) Given $\theta^{(k)}$, for $i = 1, \ldots, n$, independently draw
$b_i^{(k)} \sim N(0, \sigma_b^{(k)2})$ and $u_i^{(k)} \sim N(0, \sigma_u^{(k)2})$.

(3) For $i = 1, \ldots, n$, given $\theta^{(k)}$ and $b_i^{(k)}$, draw $w_{it}^{(k)}$ from the
Poisson with mean (2.1). Then, given $w_{it}^{(k)}$ and $u_i^{(k)}$, draw the heaping
type $g_{it}^{(k)}$ from (2.2). If $I((w_{it}^{(k)}, g_{it}^{(k)}) \in WG(y_{it})) = 0$, discard
$(w_{it}^{(k)}, g_{it}^{(k)})$ and repeat this step until
$I((w_{it}^{(k)}, g_{it}^{(k)}) \in WG(y_{it})) = 1$.

To assess model fit, we plot $K$ histograms of the imputed latent count $w$.
Implausible patterns in these histograms, such as peaks or troughs at
multiples of 5, suggest incorrect modeling of the heaping. We can also base
discrepancy diagnostics specifically on the fractions of reported
consumptions that are divisible by 5.

4. Simulations. To examine the performance of our approach, we conducted
simulations replicating the structure of the Shiffman data with $m = 12$
nonvisit-day observations per subject. Each data set consisted of $n = 100$
subjects, and for simplicity we do not consider baseline covariates. For each
subject we first set the observed EMA count vector from the data and
generated a random effect $b_i \sim N(0, \sigma_b^2 = 0.09)$. We then generated $W_{it}$
values as independent Poisson deviates with conditional mean (2.1). With
$\beta_0 = 2.358$ and $\beta_1 = 0.2628$, when $b_i = 0$ and EMA count $x_{it} = 20$, the mean
latent recall is 23.2, and when $x_{it} = 30$ it is 25.8. With the random effect
distributed as designated above, the marginal mean recalls for $x_{it} = 20$ and
$x_{it} = 30$ are 24.3 and 27.0, respectively.

Next we generated the latent heaping behavior indicator $G_{it}$ from (2.2).
We set the parameters to their estimates from the Shiffman data: the
intercepts $\gamma_1$, $\gamma_2$, $\gamma_3$ were $-1.485$, $-5.280$ and $-10.141$, respectively,
and the slope $\gamma_0$ was 0.1098. We simulated the random effect
$u_i \sim N(0, \sigma_u^2 = 7.1)$. Under this setting, when $u_i = 0$ and $w_{it} = 22$, the
probability of exact reporting is 28.3%, and the probabilities of rounding to
the nearest multiples of 5, 10 and 20 are 66.3%, 5.4% and 0.04%,
respectively. When the latent count $w_{it} = 36$, these probabilities are 7.8%,
71.2%, 20.8% and 0.2%, respectively. The simulated latent $w_{it}$ and $g_{it}$
determined $y_{it}$ as illustrated in Figure 1.

These parameter values allow for considerable discrepancy between
remembered and recorded consumption. To examine our methods when the
latent recall and EMA match more closely, we conducted a second simulation
under parameter values that gave better agreement. In this scenario,
we assumed $\beta_0 = 0$ and $\beta_1 = 1$ with $b_i \sim N(0, 0.05)$. Thus, when $b_i = 0$,
the expected precise recall $E(w_{it}) = x_{it}$, and the marginal mean recalls are
20.5 and 30.8 for EMA counts of 20 and 30, respectively. We set the
parameters in the heaping behavior models at $-1.07$, $-4.37$, $-6.52$ and 0.088
for $\gamma_1$, $\gamma_2$, $\gamma_3$ and $\gamma_0$, respectively, and $\sigma_u^2 = 5.9$. In this case, when
$u_i = 0$, the probabilities of reporting exactly and to the nearest multiples
of 5, 10 and 20 for a true count of 22 are 29.6%, 62.3%, 7.1% and 1%,
respectively.

Table 1 presents summaries of 100 simulations of estimates of the
parameter $\theta = (\beta_0, \beta_1, \sigma_b, \gamma_1, \gamma_2, \gamma_3, \gamma_0, \sigma_u)$. Under both scenarios,
the MLEs of the fixed-effect coefficients fell near the true values on
average, with no more than 0.5% bias for the parameters in the recall model
and no more than 2.7% bias for those in the heaping model. The random-effects
variances are also well estimated, with bias less than 1%. The coverage
probabilities of nominal 95% confidence intervals range from 93% to 98%,
except for $\gamma_3$ in case 1, where coverage is only 80%. The poor coverage rate
for this parameter is a consequence of instability in the inverse Hessian
matrix; it can be improved by creating parametric bootstrap confidence
intervals (Table 2).
The simulation shows good performance of the MLEs, and, as the sample 
size is large, we expect the Bayesian estimates to behave similarly. Moreover, 
the maximization part of the MLE calculation can help identify multimodal- 
ity of the likelihood, should it occur, and singularity of the Hessian that we 
use in the Bayesian sampling. 

Table 1. Results of 100 simulations of the mis-remembering/heaping model.

5. Data analysis. We applied the method of Section 2 to the Shiffman
data, with the aim of evaluating our posited two-stage process as an
explanation for the discrepancy between actual and reported consumption. To
focus on the link between the self-report and true count, our first analysis
included only log EMA count in (2.1) and a visit day indicator in (2.2). The
latter is important because it seems reasonable that distance in time from
the event would be a strong predictor of heaping coarseness. Our second
analysis expanded the recall model to include a range of baseline
characteristics: demographics (age, sex, race and education); addiction;
measures of nicotine dependence [the Fagerstrom Test for Nicotine Dependence
(FTND) and the Nicotine Dependence Syndrome Scale (NDSS)]; and EMA
compliance measured as the daily percentage of missed prompts. Age,
education, FTND and EMA compliance are considered as quantitative variables,
sex and race are binary indicators, and addiction is a categorical variable
taking three levels (possible, probable and definite). They are the first
variables that a smoking researcher would think to investigate, and could
potentially affect remembered count or heaping probability. The two measures
of nicotine dependence, FTND and NDSS, showed only a modest correlation
(Spearman r = 0.56 in our data), so we considered both in the model. The
data set and programming code are included in the supplementary materials
[Wang et al. (2012)].

5.1. Evaluating goodness of fit. We evaluated model fit by creating mul- 
tiple draws from the posterior predictive distribution of latent quantities as 
discussed in Section 3. Lack of smoothness in the histogram of the imputed 
latent count would suggest an inadequate heaping model. 

We evaluated goodness of fit for the model that includes log EMA count 
in (2.1) and a visit day indicator in (2.2). The top row in Figure 2 displays 
the histograms of TLFB cigarette consumption at days 3 (a visit day), 9 and
14. The spikes at 10, 15, 20, 25, 30, etc. are characteristic of self-reported 
cigarette counts [Wang and Heitjan (2008)]. As many as 70% of subjects 
reported cigarette smoking in multiples of 5 for nonvisit-day consumption, 
whereas for the visit day (day 3) that number is only 48%. Only 1/4 of the 
counts on the visit day ended in 0. 

The next three rows represent independent draws of the latent count
$W_{it}$. The spikes at multiples of 20, 10 or 5 have disappeared. Compared to
the self-reported count, the percentage of subjects whose exact counts are 
divisible by 5 (or 10 or 20) is smaller and consistent across time. Averaged 
over three imputations, the fraction of counts ending in multiples of 5 is 
27%, 25% and 23% on days 3, 9 and 14, respectively, and 15%, 14% and 
12% end in multiples of 10. These checks indicate that our model offers a 
plausible explanation for the heaping. 
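
The divisibility check used here is simple to compute. As a point of reference, counts spread smoothly over the integers would have roughly 20% of values divisible by 5 and 10% divisible by 10, so the imputed fractions above are much closer to that benchmark than the 70% seen in the raw nonvisit-day reports. The sketch below shows the computation on two small, made-up vectors (the data are hypothetical and serve only to illustrate the function).

```python
def fraction_divisible(counts, k):
    """Share of counts that are exact multiples of k."""
    counts = list(counts)
    return sum(c % k == 0 for c in counts) / len(counts)

def heaping_summary(counts):
    """Fractions of counts divisible by 5, 10 and 20."""
    return {k: fraction_divisible(counts, k) for k in (5, 10, 20)}

# Hypothetical illustration: a heaped report vector and a smooth run of counts.
reported = [20, 20, 15, 25, 30, 20, 18, 40, 10, 22]
smooth = list(range(10, 40))
print(heaping_summary(reported))
print(heaping_summary(smooth))
```

Applied to each set of imputed latent counts, the same summary gives a numerical companion to the visual inspection of the histograms.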

Fig. 2. Top row: histogram of self-reported cigarette consumption. Lower three rows:
histograms of draws from the posterior distribution of the latent exact consumption recall.

5.2. The fitted model. In order to assess the impact of the assumed
correlation structure, we fit the model as proposed in (2.1) and (2.2) and
also a model that excludes random effects. Posterior modes and 95% credible
intervals (CIs) appear in Tables 3 and 4. The estimates in both the
remembered count model that characterizes the latent recall process and the
heaping behavior model are sensitive to the assumption of random effects. The
Bayesian information criterion (BIC) of the model with two random effects
is 14,705 when including EMA as the only predictor and 14,059 when
including EMA and the baseline patient characteristic predictors. The BICs
for the corresponding models excluding random effects are 18,340 and 16,641,
respectively. Thus, the evidence is overwhelming that the mixed model is
preferable. Furthermore, we included the patient characteristic predictors as
covariates in both the remembered count model and heaping process model,
but this model (BIC = 14,079) is less favorable than the model with
the covariates in just the latent remembered count model. None of these
predictors is significant in the heaping process model (results not shown).
Table 3

Estimated parameters from the Shiffman data under simple models for recall (EMA only)
and heaping (remembered count and visit day indicator)

                               Random effects model          Independence model
Parameter                      Posterior                     Posterior
                               mode     95% CI               mode     95% CI

Latent recall: Poisson model
  Intercept: $\beta_0$             2.32    [2.24, 2.40]          1.14    [1.09, 1.20]
  ln(EMA): $\beta_1$               0.27    [0.25, 0.30]          0.68    [0.66, 0.69]
  $\sigma_b^2$                     0.09    [0.08, 0.11]          -       -

Heaping behavior: proportional odds model
  Intercept 1: $\gamma_1$         -1.50    [-2.17, -0.85]       -1.06    [-1.30, -0.84]
  Intercept 2: $\gamma_2$         -5.21    [-6.14, -4.43]       -2.94    [-3.26, -2.65]
  Intercept 3: $\gamma_3$        -10.15    [-12.49, -8.48]      -4.17    [-4.59, -3.82]
  Exact count (latent): w      0.11    [0.09, 0.13]          0.07    [0.06, 0.08]
  Visit day                   -2.96    [-3.50, -2.50]       -1.29    [-1.54, -1.06]
  $\sigma_u^2$                     6.65    [5.12, 9.08]          -       -


The 95% CI of $\beta_1$ is [0.23, 0.28], indicating that remembered consumption
is positively associated with recorded EMA consumption. In addition,
baseline patient characteristics FTND, NDSS, race and gender have significant
effects on the recall process. For fixed EMA count, the following
characteristics are associated with greater remembered smoking: higher
nicotine dependence (measured by both FTND and NDSS), white ethnicity
(compared to black) and male sex.

Figure 3 displays the estimated curve of the mean of $W_{it}$ against the EMA
count. A natural hypothesis is that the estimated latent mean agrees with
EMA, which would be reflected in the Poisson model by an estimated
intercept of 0 and slope of 1; one might call this a model of unbiased memory.
To the contrary, Figure 3 shows that the fitted mean curve diverges
substantially from the 45° line, with the lighter smokers on average
overestimating their consumption and the heavier smokers underestimating
consumption. The mean remembered consumption agrees with the true count
roughly in the range 22-26 cigarettes, or slightly more than a pack per day.
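
A rough check of where the fitted mean meets the 45° line follows directly from (2.1): without covariates, setting $\exp(\beta_0 + \beta_1 \ln x) = x$ gives $x = \exp\{\beta_0/(1 - \beta_1)\}$. The sketch below evaluates this with the simple-model posterior modes from Table 3, which is an assumption made for illustration; the curve plotted in Figure 3 comes from the expanded model with covariates fixed at reference values, so the numbers need not match it exactly. The function name is hypothetical.

```python
import math

def crossing_point(beta0, beta1, sigma2_b=0.0):
    """EMA count x at which exp(beta0 + beta1 * ln x + sigma2_b / 2) equals x (requires beta1 != 1)."""
    return math.exp((beta0 + sigma2_b / 2.0) / (1.0 - beta1))

# Posterior modes from the random effects column of Table 3.
print(round(crossing_point(2.32, 0.27), 1))        # conditional mean at b_i = 0
print(round(crossing_point(2.32, 0.27, 0.09), 1))  # marginal mean, averaging over b_i
```

Both values fall inside the 22-26 range quoted above. Below the crossing point the model mean exceeds the EMA count, and above it the mean falls short, which is the overestimation by lighter smokers and underestimation by heavier smokers described in the text.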

Figure 4 shows the estimated heaping probability as a function of
remembered cigarette consumption for visit and nonvisit days. The possibility
of rounded-off reporting increases rapidly as the remembered count increases,
although surprisingly the probability of rounding to the nearest 20 is not
large for either type of day. When the perception of smoking is more than
two packs, say, 41 cigarettes, the chance of heaped reporting rises to more
than 84%, of which 37% is attributed to half-pack rounding. The results
confirm that the degree of heaping is much smaller on visit days. For
example, only 51% of subjects round off the visit-day count when reporting 41
cigarettes, and among those 39% round off to the nearest multiple of 5.

Table 4

Estimated parameters from the Shiffman data under an expanded model for recall

                               Random effects model          Independence model
Parameter                      Posterior                     Posterior
                               mode     95% CI               mode     95% CI

Latent recall: Poisson model
  Intercept: $\beta_0$             2.34    [2.21, 2.49]          1.51    [1.44, 1.58]
  ln(EMA): $\beta_1$               0.25    [0.23, 0.28]          0.53    [0.51, 0.55]
  Addicted
    Possible vs. definite      0.07    [-0.10, 0.24]         0.05    [0.01, 0.09]
    Probable vs. definite     -0.01    [-0.11, 0.08]        -0.02    [-0.04, 0.006]
  FTND                         0.06    [0.04, 0.08]          0.04    [0.03, 0.05]
  NDSS                         0.08    [0.05, 0.12]          0.05    [0.04, 0.06]
  EMA compliance               0.13    [-0.28, 0.51]         0.39    [0.29, 0.49]
  Age                          0.002   [-0.001, 0.006]       0.003   [0.002, 0.004]
  Race (black vs. white)      -0.14    [-0.27, -0.01]       -0.06    [-0.10, -0.03]
  Sex (male vs. female)        0.16    [0.10, 0.23]          0.12    [0.09, 0.23]
  Education                   -0.001   [-0.03, 0.02]         0.003   [-0.004, 0.009]
  $\sigma_b^2$                     0.06    [0.05, 0.07]          -       -

Heaping behavior: proportional odds model
  Intercept 1: $\gamma_1$         -1.62    [-2.35, -0.90]       -1.14    [-1.37, -0.91]
  Intercept 2: $\gamma_2$         -5.52    [-6.42, -4.61]       -3.15    [-3.47, -2.82]
  Intercept 3: $\gamma_3$        -10.31    [-12.65, -8.37]      -4.54    [-4.99, -4.08]
  Exact count: w               0.11    [0.09, 0.14]          0.07    [0.06, 0.08]
  Visit day                   -2.99    [-3.51, -2.47]       -1.26    [-1.50, -1.02]
  $\sigma_u^2$                     6.79    [4.73, 8.68]          -       -

6. Discussion. We have developed a model to describe the process whereby 
exact longitudinal measurements become distorted by retrospective recall. 
Our approach uses latent processes to explain the data as a result of mis- 
remembering and rounding: a model of the latent exact value describes 
subject-level recall and allows for association over time and with baseline 
predictors, while a misreporting model describes the dependence of heaping 
coarseness on the latent value and other predictors. Random effects repre- 
sent individual propensities in recall and heaping; in our data, inferences 
depend strongly on the inclusion of these random effects. 

The data suggest that both mis-remembering and heaping contribute
substantially to the distortion of cigarette counts. The curve of mean
remembered count as a function of EMA count departs markedly from the 45°
line, with lighter smokers overstating consumption and heavier smokers
understating consumption. The remembered smoking coincides with the accurate
EMA count at around 24 cigarettes, suggesting that the popularity of
reporting one pack per day is partially a result of the general heaping
behavior rather than a particular affinity for remembering a pack a day. The
curves of heaping probabilities suggest that exact reporting is uncommon and
practically disappears beyond about 40 cigarettes/day. Nevertheless, it is
interesting just how much of the misreporting is due to mis-remembering. The
remembered cigarette consumption depends not only on true consumption,
but also on the subject's sex, race and degree of nicotine dependence.

Fig. 3. Estimate of the conditional mean of recalled count given EMA count in
the Poisson mis-remembering model, with 95% confidence bounds (horizontal axis:
cigarette consumption by EMA). Covariates are fixed at education = high school,
addicted = definitely, race = white, sex = female, and mean values of the
quantitative predictors: FTND = 5.97, NDSS = -0.023, age = 43.5, and EMA
noncompliance = 10.1%.

The interpretation of our model components as representing memory and 
rounding depends on the assumption that EMA data are exact. Of course, 
even EMA data are subject to errors, as smokers may neglect to record 
cigarettes both at the time of smoking and later. Yet good correspondence 
with smoking biomarkers strongly supports the use of EMA over TLFB as 
a proxy for the truth [Shiffman (2009)]. 

Fig. 4. Estimated rounding behavior given EMA count in the proportional odds heaping
model.

We have implemented our model with a combination of standard
numerical methods including Gaussian quadrature, quasi-Newton optimization
and sampling-importance resampling. Our experience suggests that with the
model as specified, and incorporating a modest number of predictors, the
method is robust and efficient. Increasing the number of random effects
would increase the time demands (from the numerical integration) and raise
the possibility of numerical instability (from possible errors in integration).
For more extensive models, sophisticated approaches based on MCMC
sampling would be necessary.
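
For readers unfamiliar with sampling-importance resampling, the following is a generic sketch of the technique on a toy one-parameter target. It is not the supplementary code: the function name `sir`, the normal target and the overdispersed normal proposal centered at the mode are assumptions made for illustration.

```python
import math
import random


def sir(log_target, draw_proposal, log_proposal, n_draws, n_keep, rng):
    """Generic sampling-importance resampling.

    1. Draw n_draws candidates from the proposal.
    2. Weight each by target / proposal (unnormalized log densities are fine).
    3. Resample n_keep candidates with probability proportional to the weights.
    """
    draws = [draw_proposal(rng) for _ in range(n_draws)]
    log_w = [log_target(d) - log_proposal(d) for d in draws]
    top = max(log_w)  # subtract the maximum for numerical stability
    weights = [math.exp(lw - top) for lw in log_w]
    return rng.choices(draws, weights=weights, k=n_keep)


# Toy example (hypothetical): unnormalized N(1, 0.5^2) target, with a wider
# N(1, 1^2) proposal centered at the target mode.
rng = random.Random(7)
sample = sir(
    log_target=lambda x: -0.5 * ((x - 1.0) / 0.5) ** 2,
    draw_proposal=lambda r: r.gauss(1.0, 1.0),
    log_proposal=lambda x: -0.5 * (x - 1.0) ** 2,
    n_draws=20000,
    n_keep=2000,
    rng=rng,
)
sample_mean = sum(sample) / len(sample)
print(round(sample_mean, 2))
```

The proposal should have heavier tails than the target, otherwise a few candidates dominate the weights and the resample degenerates.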

Our model allows for the inclusion of covariates to better explain the 
discrepancy between smokers' self-perceived behaviors and reality. It also 
provides a basis for predicting true counts (effectively the EMA data) from 
reported TLFB counts. This would be a valuable activity in the large number 
of studies that do not collect EMA data. To predict true counts from the
recalled counts, we first need to estimate the parameters $\theta$ in the model using
a subset of the primary study or an external independent study that collects
both TLFB count Y and accurate EMA count X. Then we can impute the
true count together with the latent remembered count and heaped reporting
behavior. Specifically, the posterior distribution of $(W_i, G_i, X_i, b_i, u_i)$ is

$$
\begin{aligned}
f(w_i, g_i, x_i, b_i, u_i \mid y_i, \theta)
  &\propto \Biggl( \prod_{t=1}^{m_i} f(w_{it} \mid x_{it}, b_i, \theta)\,
     f(g_{it} \mid w_{it}, u_i, \theta)\,
     I\bigl((w_{it}, g_{it}) \in WG(y_{it})\bigr) \Biggr) \\
  &\quad \times f(x_i)\, f(b_i \mid \sigma_b)\, f(u_i \mid \sigma_u),
\end{aligned}
$$

where $f(x_i)$ is the density function of the true count. Imputation follows
similar steps as described in Section 3 with $\theta$ set equal to the maximum
likelihood estimates.
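
To make the prediction step concrete, here is a deliberately simplified sketch that computes the posterior of a single day's true count given one reported count by direct enumeration. It ignores the random effects and the dependence across days, and every numeric value (heaping grids, cutpoints, slope, recall-mean parameters, the uniform prior) is a hypothetical choice for illustration rather than an estimate from the paper.

```python
import math

GRIDS = (1, 5, 10, 20)  # assumed heaping grids: exact count, nearest 5, 10, 20


def poisson_pmf(w, mean):
    """Poisson probability of w, computed on the log scale (mean must be > 0)."""
    return math.exp(w * math.log(mean) - mean - math.lgamma(w + 1))


def recall_mean(x, pivot=24.0, shrink=0.8):
    """Hypothetical mean remembered count given a true count x > 0."""
    return pivot * (x / pivot) ** shrink


def heaping_probs(w, cutpoints=(-1.5, -4.0, -6.5), slope=0.1):
    """P(G = each grid | w) from assumed cumulative logits (no random effect)."""
    exceed = [1.0 / (1.0 + math.exp(-(c + slope * w))) for c in cutpoints]
    bounds = [1.0] + exceed + [0.0]
    return [bounds[k] - bounds[k + 1] for k in range(len(GRIDS))]


def report_likelihood(y, x, w_max=150):
    """P(Y = y | X = x): sum over every (w, g) pair that rounds to y."""
    total = 0.0
    for w in range(w_max + 1):
        p_w = poisson_pmf(w, recall_mean(x))
        for grid, p_g in zip(GRIDS, heaping_probs(w)):
            if grid * math.floor(w / grid + 0.5) == y:  # (w, g) is in WG(y)
                total += p_w * p_g
    return total


def posterior_true_count(y, prior):
    """Normalize prior[x] * P(Y = y | X = x) over the support of the prior."""
    joint = {x: fx * report_likelihood(y, x) for x, fx in prior.items()}
    norm = sum(joint.values())
    return {x: v / norm for x, v in joint.items()}


# Uniform prior on 1..60 cigarettes/day (an assumption), reported count of 20.
prior = {x: 1.0 / 60 for x in range(1, 61)}
post = posterior_true_count(20, prior)
post_mean = sum(x * p for x, p in post.items())
print(round(post_mean, 1))
```

In a full implementation the sum over $(w, g)$ would be replaced by joint draws of the latent quantities and random effects, as the text describes; the enumeration here only shows how the indicator restricts the latent pairs to those consistent with the report.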

The methods developed here also can have application in a wide variety of 
settings in social and medical science involving self-reported data — for exam- 
ple, assessing sexual risk behavior, trial drug consumption, eating episodes 
and financial expenditures. 

Acknowledgments. We are grateful to two Associate Editors and a ref- 
eree, whose perceptive comments and suggestions greatly improved the pa- 
per. 

SUPPLEMENTARY MATERIAL 

Data and programming code for the analysis 

(DOI: 10.1214/12-AOAS557SUPP; .zip). It contains the daily TLFB and 
EMA data set, and SAS and R code to implement the method.